Abstract:
This brief document describes how to allocate and deallocate memory correctly,
i.e. in a way compatible with the Os (and, as a result, compatible with PoolMem).
______________________________________________________________________________
The following rules apply to all programs that are supposed to run in an OS
friendly way. I didn't make them up myself. What you find here is more or
less a copy of the rules from the ROM Kernel Reference Manuals (RKRMs), the
official Amiga developer documentation.
Breaking these rules will result in unstable programs, with or without
any additional memory tools. A program that seems to run fine without
PoolMem, but crashes with PoolMem is, nevertheless, unstable and might crash
in certain situations, even without this tool.
Allocation of memory:
o) The MEMF_PUBLIC bit:
Set the MEMF_PUBLIC bit (exec/memory.h). You usually want it!
NOT setting this bit results in memory that
a) is *private* to your task, i.e. can't be read from any other task,
b) and can't be read safely within a Forbid()/Permit() or
Disable()/Enable() pair.
The current Os DOES NOT implement any checks for this rule, nor
does PoolMem. However, future memory managers might take a missing
MEMF_PUBLIC bit as a hint to assign "virtual memory" to the
allocation, i.e. memory that can be swapped out to disk.
As an example, VMM requires correct usage of this bit.
All memory that is supposed to hold Os structures MUST BE ALLOCATED
WITH THE MEMF_PUBLIC FLAG SET. Likewise, any memory that is passed to
other tasks, interrupts or exceptions, or that is used as an I/O
buffer, MUST BE ALLOCATED WITH THIS BIT SET.
The only exceptions are private structures that are read and written
only by your task, that are never passed to, read or written by other
processes or Os functions, and that are not accessed with multitasking
disabled.
o) Memory flushes:
Be prepared that a memory allocation might flush unused libraries,
fonts and devices from memory. In particular, DO NOT USE CLOSED
RESOURCES. Finding a resource with FindName() on the exec resource
lists IS NOT ENOUGH to use it; you have to open it properly.
If you DO NOT want resources to get flushed, set the
MEMF_NO_EXPUNGE flag as memory attribute. See exec/memory.h.
o) Memory and custom chips:
Memory that should be read by the Amiga custom chip set MUST BE
ALLOCATED with the MEMF_CHIP attribute set, or the custom chips
won't be able to address this memory. That goes for:
o) display buffers (native bitmaps)
o) hardware copper lists (but not their gfx abstractions for the
CMove(), CWait() etc. family)
o) image bitmaps (struct Image->ImageData)
o) floppy hardware buffers (but since V37 not required for the
trackdisk.device I/O buffer)
o) hardware audio buffers
o) hardware sprites and the image data of "Bobs"
o) everything else the custom chip set might access
o) Order of memory blocks:
Do not make *any* assumptions about the order in which you get
memory. The second allocation is not necessarily at the higher
address!
o) The MEMF_FAST bit:
Do not use the MEMF_FAST bit unnecessarily if chip mem would be
O.K. for you, too. The operating system is smart enough to
allocate fast memory for you if that is available. It will fall
back to chip mem if fast mem is not available. There's usually
no reason to ask for fast mem explicitly.
o) Alignment:
It is guaranteed that all memory allocated by AllocMem() is aligned
to a boundary of two long words (eight bytes), i.e. bits 0 to 2 of
the address will always be zero. NOT MORE! If you need more
alignment, see the kludge below.
o) Size of buffers:
Make sure you allocate enough memory even for the worst case. A
C style string needs n+1 bytes of memory to hold a string of length n.
Some Os functions require, due to bugs, a slightly larger buffer
than you might think; check the "BUGS" section of the autodocs.
(Mostly dos functions suffer from this, but some intuition
functions require it as well.)
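The n+1 rule can be sketched in portable C. This is only an illustration:
malloc() and free() stand in for AllocMem() and FreeMem() so that the sketch
runs anywhere, and both helper names are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Bytes needed to store a C string: its n characters plus the NUL. */
static size_t buffer_size_for(const char *text)
{
    return strlen(text) + 1;
}

/* Hypothetical helper: copy a string into a freshly allocated buffer.
   malloc() is only a portable stand-in for AllocMem(). */
static char *dup_string(const char *text)
{
    size_t size = buffer_size_for(text);
    char *copy = malloc(size);

    if (copy != NULL)               /* the request may fail, so check it */
        memcpy(copy, text, size);   /* copies the terminating NUL as well */
    return copy;
}
```

With AllocMem() you would also have to keep the computed size around, since
FreeMem() must be called with exactly the same size.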
o) Memory attributes:
Do NOT set ANY undocumented bits for the memory attributes of
AllocMem(). They *might* be ignored by this version of the Os,
but probably won't be by the next version. Check the exec/memory.h
file for valid flags. As for the current (V40) version of the Os,
the following flags have been defined:
#define MEMF_ANY        (0L)     /* Any type of memory will do */
#define MEMF_PUBLIC     (1L<<0)  /* Damn important, see caveats above !*/
#define MEMF_CHIP       (1L<<1)  /* for custom chips */
#define MEMF_FAST       (1L<<2)  /* explicitly fast mem, see caveats! */
#define MEMF_LOCAL      (1L<<8)  /* Memory that does not go away at RESET */
#define MEMF_24BITDMA   (1L<<9)  /* DMAable memory within 24 bits of address */
#define MEMF_KICK       (1L<<10) /* Memory that can be used for KickTags */
#define MEMF_CLEAR      (1L<<16) /* AllocMem: NULL out area before return */
#define MEMF_LARGEST    (1L<<17) /* AvailMem: return the largest chunk size */
#define MEMF_REVERSE    (1L<<18) /* AllocMem: allocate from the top down */
#define MEMF_TOTAL      (1L<<19) /* AvailMem: return total size of memory */
#define MEMF_NO_EXPUNGE (1L<<31) /* AllocMem: no expunge on failure, see above */
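A debugging aid could test a request against the documented bits before
passing it on. The following is a portable sketch only: the flag values are
local copies so that it compiles without the Amiga includes, and
MEMF_DOCUMENTED and attributes_are_documented() are names made up for this
example, not part of exec.

```c
#include <stdint.h>

/* Local copies of the documented values. MEMF_NO_EXPUNGE (see "Memory
   flushes") is bit 31 in exec/memory.h. */
#define MEMF_ANY        UINT32_C(0)
#define MEMF_PUBLIC     (UINT32_C(1) << 0)
#define MEMF_CHIP       (UINT32_C(1) << 1)
#define MEMF_FAST       (UINT32_C(1) << 2)
#define MEMF_LOCAL      (UINT32_C(1) << 8)
#define MEMF_24BITDMA   (UINT32_C(1) << 9)
#define MEMF_KICK       (UINT32_C(1) << 10)
#define MEMF_CLEAR      (UINT32_C(1) << 16)
#define MEMF_LARGEST    (UINT32_C(1) << 17)
#define MEMF_REVERSE    (UINT32_C(1) << 18)
#define MEMF_TOTAL      (UINT32_C(1) << 19)
#define MEMF_NO_EXPUNGE (UINT32_C(1) << 31)

/* Hypothetical mask of every documented attribute bit. */
#define MEMF_DOCUMENTED (MEMF_PUBLIC | MEMF_CHIP | MEMF_FAST | MEMF_LOCAL | \
                         MEMF_24BITDMA | MEMF_KICK | MEMF_CLEAR |           \
                         MEMF_LARGEST | MEMF_REVERSE | MEMF_TOTAL |         \
                         MEMF_NO_EXPUNGE)

/* Returns 1 if only documented attribute bits are set, 0 otherwise. */
static int attributes_are_documented(uint32_t attributes)
{
    return (attributes & ~(uint32_t)MEMF_DOCUMENTED) == 0;
}
```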
o) Memory contents:
Do NOT MAKE any assumptions about the contents of the memory block
unless you specified the MEMF_CLEAR attribute to erase the memory
block. Not setting this bit is a bit faster, but results in a
memory block with whatever contents you might dream of.
o) Self modifying code:
Self modifying code should be avoided.
(What do you think this is? A C64? :-)
If you absolutely MUST play with this and can't get around it,
use the following Os call to flush the CPU caches once you've
placed your code in memory and need to run it:
CacheClearU()
Do NOT expect the CPU to see your code BEFORE you have called this
routine. This is even more important for routines like interrupts
that are called asynchronously.
o) Failures:
Be prepared that your memory request might fail. An explicit
check is REQUIRED after an AllocMem() call. Just "going guru" in
this case *IS NOT ENOUGH*. Print a warning message, abort your
program safely, CHECK YOUR CODE!
Assembly language authors: NO, IT'S NOT DOCUMENTED THAT AllocMem()
SETS THE ZERO FLAG IF THE ALLOCATION FAILED. YOU HAVE TO TEST THAT
YOURSELF.
If your calling task is indeed a process, Os versions V37 and
above guarantee to set the result code for IoErr() to
ERROR_NO_FREE_STORE (=103L).
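o) Memory flushers:
The following is a safe memory flush:
AllocMem(0x7ffffff0,MEMF_PUBLIC);
The request is far too large to succeed, so all it does is make the
Os flush unused resources before it returns NULL.
(This is the flush used by the "avail flush" command.)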
o) AllocMem() and context switches:
Neither AllocMem() nor FreeMem() breaks a "Forbid" state. This is
important because it is the only way to "print" a list that is access
protected via Forbid(): printing thru the dos.library and other
functions may run into a Wait() and thereby break the Forbid() state.
The following code sequence is legal for this purpose, and should
stay legal:
- call Forbid() first,
- make a copy of that list element by element, using AllocMem(),
- call Permit(),
- print the copy of the list,
- deallocate the copy.
Running into a Wait() inside AllocMem(), as would happen if a semaphore
were used for access protection of the memory list, would be fatal here.
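The copy-then-print sequence above can be sketched as follows. To keep the
sketch runnable anywhere, forbid_standin() and permit_standin() are made-up
stand-ins for exec's Forbid() and Permit(), malloc()/free() stand in for
AllocMem()/FreeMem(), and struct name_node is a hypothetical list node, not
an Os structure.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; on the Amiga use Forbid() and Permit(). */
static int forbid_depth = 0;
static void forbid_standin(void) { forbid_depth++; }
static void permit_standin(void) { forbid_depth--; }

/* Made-up node type for this illustration. */
struct name_node {
    struct name_node *next;
    char name[32];
};

/* Deallocate a private copy made by snapshot_list(). */
static void free_snapshot(struct name_node *head)
{
    while (head != NULL) {
        struct name_node *next = head->next;
        free(head);
        head = next;
    }
}

/* Copy the shared list element by element while task switching is off.
   Returns NULL for an empty list or if an allocation failed. */
static struct name_node *snapshot_list(const struct name_node *shared)
{
    struct name_node *head = NULL;
    struct name_node **tail = &head;
    int failed = 0;

    forbid_standin();                       /* call Forbid() first */
    for (; shared != NULL; shared = shared->next) {
        struct name_node *copy = malloc(sizeof *copy);
        if (copy == NULL) {                 /* always check the result */
            failed = 1;
            break;
        }
        memcpy(copy->name, shared->name, sizeof copy->name);
        copy->next = NULL;
        *tail = copy;                       /* append to the private copy */
        tail = &copy->next;
    }
    permit_standin();                       /* call Permit() */

    if (failed) {                           /* give back a partial copy */
        free_snapshot(head);
        head = NULL;
    }
    return head;    /* print this copy, then pass it to free_snapshot() */
}
```

Nothing that might Wait() happens between the two stand-in calls; printing
and deallocating are left to the caller, after Permit().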
o) AllocMem, FreeMem, AllocAbs and interrupts:
NONE of these functions can be called from interrupts or in
supervisor mode.
Remember, however, that "input handlers" of the input.device
are not run as interrupts but in the context of the input.device
task, even though they are built on top of an interrupt structure.
Thus, calling AllocMem() here to make a copy of an input event IS
LEGAL.
_____________________________________________________________________________
Usage of stack for storing: (Or, how to allocate memory without allocating it)
It's a somewhat vague point whether the stack can be used for storing
system/Os structures or for passing structures to other tasks. The following
paragraph is my own interpretation of this technique and should be used with
some care:
The CURRENT Os implementation allows this technique. The stack is allocated
with the MEMF_PUBLIC flag set, AND MUST BE ALLOCATED THIS WAY. This is simply
due to the fact that the memory for the stack is allocated by the task that
creates a new process, and not by the new process itself. Since the AmigaOs
doesn't know the unix fork() style of creating new processes, this is the
only way of allocating the stack for the new process anyway. Thus, the stack
is kept in memory that is passed across task boundaries and must be,
therefore, public. Thus, it can be used for storing Os structures and for
passing data across processes. It's furthermore common practice to use
the stack to pass "taglists" to Os functions that might be read by a
different process, and even to keep complete Os structures on the stack, as
done by some CBM shell commands, routines in the dos.library and others.
(However, see the note below about how strictly CBM/AI follow their own
design rules!)
HOWEVER, Ralph Babel writes in "The Amiga Guru Book" (2nd ed., 1993):
"The stack is private memory ... and should not be considered MEMF_CHIP
or MEMF_PUBLIC, nor should it be used for storing system structures or code.
The latter is important, since there is no guarantee that the stack is
aligned to an even address, as these processors also allow nonbyte data
accesses from any base address, although opcodes must still be word aligned."
I do not agree with Ralph on this point, except that the stack is indeed
usually not MEMF_CHIP and shouldn't be considered to be. Storing code on
the stack is truly considered "higher magic" and should be avoided.
(Also see above for caveats IF YOU ABSOLUTELY HAVE to do this.)
However, I would suppose that stack memory is always MEMF_PUBLIC for
the reasons stated above, and it's always word aligned since the MC68K takes
care of this itself unless you really attempt to screw the stack up.
Normal usage of the stack does not break this alignment, as even a
move.b d0,-(a7)
instruction will decrement the stack pointer BY TWO BYTES, NOT BY ONE.
This is one of the lesser known features of the MC68K series, indeed, and
goes for all processors, from the MC68000 to the MC68060.
Citing Motorola's "Programmer's Reference Manual" M68000PM/AD Rev.1,
Page 2-28:
"To keep data on the system stack aligned for maximum efficiency, the active
stack pointer is automatically decremented or incremented by two for all
byte-size operands moved to or from the stack."
You should, however, still remember that you must align stack memory
to four byte boundaries by hand. The following code snippet shows how to
reserve 256 bytes of stack aligned to a longword boundary:
        lea     -$104(a7),a7    ;reserve 256 bytes plus 2+2 for alignment.
                                ;we use the extra two bytes to keep
                                ;a possible long word alignment of the
                                ;stack and to avoid speed penalties for
                                ;the more advanced processors. If you write
                                ;your own routines, you should always allocate
                                ;stack memory this way since the Os always
                                ;generates tasks with the stack pointer
                                ;aligned to four byte boundaries.
        move.l  a7,d0
        addq.l  #2,d0           ;round up
        and.b   #$fc,d0         ;to next four byte boundary
        move.l  d0,a0           ;pass pointer in a0
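The arithmetic of this snippet can be checked in portable C. The sketch
below only mirrors the four instructions on plain 32 bit integers and assumes
that the stack pointer is word aligned (even) on entry; the function names
and the two constants are made up for this illustration.

```c
#include <stdint.h>

#define RESERVED_BYTES UINT32_C(0x104)  /* lea -$104(a7),a7 */
#define BUFFER_BYTES   UINT32_C(256)

/* Returns the pointer that ends up in a0 for a given stack pointer
   value before the lea. */
static uint32_t aligned_stack_buffer(uint32_t old_sp)
{
    uint32_t d0 = old_sp - RESERVED_BYTES;  /* lea, then move.l a7,d0 */

    d0 += 2;                                /* addq.l #2,d0  */
    d0 &= ~UINT32_C(3);                     /* and.b #$fc,d0 */
    return d0;
}

/* Returns 1 if the 256 byte buffer is long word aligned and lies
   completely inside the reserved stack area. */
static int buffer_fits(uint32_t old_sp)
{
    uint32_t new_sp = old_sp - RESERVED_BYTES;
    uint32_t buffer = aligned_stack_buffer(old_sp);

    return (buffer % 4 == 0) &&
           (buffer >= new_sp) &&
           (buffer + BUFFER_BYTES <= old_sp);
}
```

For a stack pointer that is already long word aligned the buffer starts right
at the new stack pointer; for one that is only word aligned it starts two
bytes above it. In both cases the 256 bytes stay inside the reserved area.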
However, most C compilers are not smart enough for this technique. Even
AI fell into that pitfall when writing the "List" and "Dir" commands. Neither
aligns DOS structures to long words correctly. (Urgh!) But since a
similar code sequence is used successfully by the "DoPkt()" routine
inside the dos library, I would still say that using the stack for Os
structures is legal and continues to stay legal. Allocating each tiny
structure with AllocMem() instead would create a huge overhead and would
fragment memory a lot.
However, as I said, this is a somewhat vague point, you don't have to
agree with me and I'm open for a discussion.
_____________________________________________________________________________
Deallocation of memory:
o) Size of deallocation:
Deallocation size MUST MATCH ALLOCATION SIZE PRECISELY. It is
BY NO MEANS ALLOWED to
- round the size, because the rounding algorithm of the operating
system might change in future to support special hardware
(e.g. PowerPC cache lines, which are 32 bytes wide).
Now to another rule that hasn't been formulated in the RKRMs. It is
just as little allowed to
- free a partial memory block, i.e. parts of an array.
THIS IS DEFINITELY ILLEGAL, NO EXCEPTIONS, NO EXCUSES.
Free ALL OR NOTHING.
Freeing part of a memory block requires knowledge of
the alignment rules of the Os and may break code if these rules
change in future versions.
I would therefore strongly recommend NOT to use this technique.
o) Access to deallocated memory:
Do not touch deallocated memory. If it's gone it's gone and you're
no longer allowed to use it, address it, read it or write data to
it. Another task might want it.
A tiny exception that hasn't been formulated in the RKRMs, but is
unfortunately widely used:
Deallocation of memory WITHIN a Forbid()/Permit() pair. The memory,
EXCEPT FOR THE FIRST EIGHT BYTES WHICH ARE USED FOR ADMINISTRATION,
is guaranteed to stay unmodified and ready for use as long as
multitasking is disabled. Running into a Wait(), directly or
indirectly, will break the Forbid() state and will therefore make
the memory unusable.
Be warned! Even though this access is sort of legal, and hence
tolerated by MungWall, MemSniff, PoolMem and others, it's IMHO
still ugly and therefore highly discouraged. One of the very few
cases where this feature might be helpful is the following
code segment that unloads the segment of a load-and-stay-resident
program:
        move.l  SysBase(a4),a6
        jsr     _LVOForbid(a6)
        move.l  DOSBase(a4),a6
        move.l  Segment(a4),d1
        jsr     _LVOUnloadSeg(a6)       ;Unload own code
        move.l  a6,a1                   ;THIS CODE STAYS LEGAL because
        move.l  SysBase(a4),a6          ;of the Forbid()
        jsr     _LVOCloseLibrary(a6)    ;close dos
        moveq   #0,d0
        rts                             ;exit.
Note that you must be absolutely sure that the segment
is not an overlayed segment, because UnloadSeg() WILL break the
Forbid() state in this case. However, overlays don't work for
load-and-stay-resident programs anyway.
o) Chip memory and blitter access:
The custom "blitter logic" uses DMA and accesses the chip memory
independently of the CPU. If you use a temporary buffer for the blitter,
make sure the blitter no longer accesses this buffer before you
deallocate it. To be on the safe side, call WaitBlit() before
deallocating memory that has been used as a blitter buffer.
o) Memory and hardware DMA access:
Modern hard disk interfaces might access memory by DMA, in parallel
to the CPU. If you're planning to use this hardware DMA directly
because you're writing a device driver for this hardware, be
prepared to flush the CPU caches properly. In particular, call
CachePreDMA(...)
prior to the DMA operation and
CachePostDMA(...)
afterwards.
Check the autodocs for details about these functions and their
parameters.
o) Return value:
FreeMem() DOES NOT return any useful value, nor does it set any
condition codes.
______________________________________________________________________________
AllocAbs and other weirdos:
AllocAbs is for the specialized purpose of allocating memory at a predefined
location. DO NOT USE IT WITHOUT GOOD REASON.
o) Range of allocated memory:
AllocAbs performs some rounding. Be prepared that the memory block
you get is not identical to the memory block you requested.
However, IF the memory allocation could be satisfied, the requested
memory block is guaranteed to be contained in the returned memory
block.
Be prepared that the memory request cannot be satisfied because
the requested memory is already in use by a different task.
AllocAbs() returns NULL in this case. You have to check for this
explicitly! It does NOT set any condition codes.
AllocAbs() WILL NOT set the ERROR_NO_FREE_STORE return code for
IoErr().
o) Contents of allocated memory:
Do not make any assumptions about the contents of the
allocated memory block. The Os uses parts of the free memory blocks
for administrative purposes and might have trashed parts of the
memory block.
For reset resident programs, whose memory is allocated this way by
the exec KickMemPtr mechanism, that means especially that the
first eight bytes will be trashed. Be prepared for that feature!
o) Deallocation of AllocAbs()-ed memory:
To be sure that the allocated memory is really deallocated completely,
call FreeMem with the memory address and size you REQUESTED, NOT
with the return value of AllocAbs(). This might sound strange indeed,
but the FreeMem() logic performs the same rounding of size and address
as the AllocAbs() logic. If, however, you pass in a different address,
e.g. the return value instead of the requested address, it is not
guaranteed that really all memory is deallocated.
A tiny example might be helpful (assuming the current rounding
algorithm):
AllocAbs(0x07,0x300007);
allocates 16 bytes and returns 0x300000. Calling now
FreeMem(0x300000,0x07);
will only free EIGHT bytes starting from 0x300000 instead of
16 bytes. However,
FreeMem(0x300007,0x07);
will work as required.
I'm sorry to say that the kludge documented in the last revision
of this file failed for the same reason; this has been fixed.
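The numbers of this example can be reproduced with a few lines of portable
C. The sketch assumes the current rounding granule of eight bytes, rounds the
start of a request down and its end up, and works on plain integers instead
of real memory; GRANULE and the function names are made up for this
illustration.

```c
#include <stdint.h>

/* Assumption: the current rounding granule of exec is eight bytes. */
#define GRANULE UINT32_C(8)

/* Round an address down to the start of its granule. */
static uint32_t round_down(uint32_t address)
{
    return address & ~(GRANULE - 1);
}

/* Round an address up to the next granule boundary. */
static uint32_t round_up(uint32_t address)
{
    return (address + GRANULE - 1) & ~(GRANULE - 1);
}

/* Number of bytes really covered by a request of `size` bytes at
   `address` once start and end have been rounded to the granule. */
static uint32_t covered_bytes(uint32_t address, uint32_t size)
{
    return round_up(address + size) - round_down(address);
}
```

covered_bytes(0x300007,0x07) gives 16, the size of the block AllocAbs()
handed out, whereas covered_bytes(0x300000,0x07) gives only 8.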
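o) Using AllocAbs() for aligned memory allocation:
The following code segment is a kludge for allocating memory aligned
to a boundary:
#include <string.h>
#include <exec/types.h>
#include <exec/memory.h>
#include <proto/exec.h>

void *AllocAligned(ULONG bytesize,ULONG attributes,ULONG alignment)
{
    UBYTE *mem,*res;

    alignment--;    /* turn the power of two into a bit mask */
    /* over-allocate so that an aligned block must fit; clear later */
    if ((mem = AllocMem(bytesize+alignment,attributes & (~MEMF_CLEAR)))) {
        Forbid();   /* nobody may grab the block in between */
        FreeMem(mem,bytesize+alignment);
        /* round up; a pointer can't be masked, so go through ULONG */
        mem = (UBYTE *)(((ULONG)mem + alignment) & (~alignment));
        res = AllocAbs(bytesize,mem);
        Permit();
        if (res) {
            if (attributes & MEMF_CLEAR)
                memset(mem,0,bytesize);
        } else mem = NULL;
    }
    return mem;     /* the REQUESTED address, as FreeMem() needs it */
}
I.e., call this routine with "alignment" set to 16 for an alignment
to a sixteen byte boundary.
Calling this routine with anything but a power of two for the
alignment doesn't make much sense and is illegal.
Note that the memory is cleared MANUALLY if MEMF_CLEAR is set.
This MUST be done since AllocAbs() does not guarantee the
contents of the memory, even if the former AllocMem() already
cleared the memory.
Free the block with FreeMem(mem,bytesize), i.e. with the address and
size that were passed to AllocAbs().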
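The rounding step of AllocAligned() is plain integer arithmetic and can be
checked without an Amiga. The sketch below works on 32 bit integers instead
of real pointers; align_up() and aligned_block_fits() are names made up for
this illustration. It shows why asking for alignment-1 extra bytes is always
enough.

```c
#include <assert.h>
#include <stdint.h>

/* Round `address` up to the next multiple of `alignment`, which must be
   a power of two. Same arithmetic as in AllocAligned(). */
static uint32_t align_up(uint32_t address, uint32_t alignment)
{
    uint32_t mask = alignment - 1;

    assert(alignment != 0 && (alignment & mask) == 0); /* power of two */
    return (address + mask) & ~mask;
}

/* Returns 1 if an aligned block of `bytesize` bytes fits into a raw
   block of bytesize + alignment - 1 bytes starting at `raw`. */
static int aligned_block_fits(uint32_t raw, uint32_t bytesize,
                              uint32_t alignment)
{
    uint32_t start = align_up(raw, alignment);
    uint32_t raw_end = raw + bytesize + (alignment - 1);

    return start >= raw && start + bytesize <= raw_end;
}
```

Rounding up moves the start by at most alignment-1 bytes, which is exactly
the slack the first AllocMem() call asked for.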
_____________________________________________________________________________
Any program that obeys these rules won't have *any* problems with PoolMem!
_____________________________________________________________________________
Debugging tools (memory related):
The following three debugging tools are "official" AI tools and should be
used by any serious developer:
-Enforcer: Detects memory accesses to the vector base and to unmapped
memory regions.
-MungWall: Detects a lot of illegal accesses from the list above, such
as failing to initialize memory properly, accessing
deallocated memory and others. However, it *could* do more.
-SegTracker: Keeps program names together with their loaded segments
for easy identification of code.
Even a program that runs without problems with these tools is not
necessarily bug free!
The use of the following debugging tools is highly recommended:
-PatchWork: (by Richard Körber)
Detects invalid parameters to Os calls.
I would also recommend the following combination. Since this is my own
stuff, I can't be very objective. You might want to check them out....
- COP: (my own stuff)
Catches gurus and exceptions "on line" for straightforward
debugging.
- MemSniff: Even pickier than MungWall. It detects software failures
and memory problems MungWall can't find. However, check
the documentation as this tool has its special "caveats".
It should be used in conjunction with COP since its output
is not as complete as that of MungWall or Enforcer.
- SaferPatches: Detects illegal function patches. If this one crashes with
a guru, something is wrong. For details, check the doc
of the SaferPatches archive.
_____________________________________________________________________________
Another set of weirdos for the "enlightened". (-:
The following is a list of "OS features" you should be aware of if you
consider writing your own memory tool. I found them when writing PoolMem,
so they are here for your information. However, DO NOT USE THESE TECHNIQUES
in your own code.
Even though the above rules have been set up for the developer, that doesn't
mean that the Os respects these rules ("Quod licet Iovi non licet bovi.").
I found the following "OS features":
- The FFS (all versions V37 thru V43) expects a return value of "-1" from
FreeMem(). This has been fixed for release 43.20.
PoolMem contains a kludge for that. MemSniff and MungWall will mess up
the registers on purpose. The result code of FreeMem() can be set
with FREEMEMRESULT. The default value is -1 to fix the earlier FFS releases.
Other programs, as for example the "RexxPlus" compiler, require different
result codes to work properly: the default value "-1" conflicts with a bug
in the compiled code. Setting the result code to "-2" might help in these
cases.
- The layers.library allocates memory in large blocks, but deallocates such
a large block of memory in a series of small deallocations. In other words,
it breaks up large memory blocks into smaller ones.
The current version of PoolMem respects this behaviour, not only for the
layers.library. MungWall and MemSniff include special kludges to allow this
EXCLUSIVELY for the layers.library. However, THIS HAS BEEN ILLEGAL, IS
ILLEGAL AND WILL CONTINUE TO BE ILLEGAL. I hope that this mess will be
cleaned up in a future Os revision.
- Some programs expect the Z (zero) flag of the CPU to be set after an
AllocMem() call on failure and to be cleared if the allocation worked.
THIS IS UNDOCUMENTED.
The current version of PoolMem contains a kludge to make these programs
work.
- Some Os functions allocate Os structures from the stack and pass them
to other tasks. (The DoPkt() routine is one example, but there are others.)
I would still say that this is O.K., but if you don't want to follow me on
this point, it's an Os bug.
_____________________________________________________________________________
Thomas Richter, November 1998